Bug 1917484: Don't adopt after clean failure during deprovisioning #122

Merged
openshift-merge-robot merged 2 commits into openshift:master from honza:cherry-deprov-failure-adopt
Feb 1, 2021

@honza honza commented Feb 1, 2021

This is meant to supersede #121. It also includes the commit "Ironic: Add result functions", without which the fix cannot be built.

zaneb added 2 commits February 1, 2021 09:17
Add a set of functions that lend semantic meaning to the various Result
values that can be returned - some of which result in identical Result
structures being returned to the caller. This is the first step toward
improving the provisioner API to allow the controller more insight into
how it should respond to events in the provisioner.

The six possible events are:

* A conflict that should be retried with a delay
* A change to the Host Status that requires it to be written back to the
  k8s API
* Waiting for an ongoing process
* Successful completion of the current operation
* A failure of the current operation
* A transient error (e.g. network connection failure)

(cherry picked from commit 0e1acfe)
Signed-off-by: Honza Pokorny <honza@redhat.com>
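
A minimal sketch of how such semantic helpers might look, one per event listed in the commit message above. The `Result` type, its field names, and the helper names are assumptions made for illustration and are not taken from this page. Note that two of the helpers deliberately build the same structure, which is the situation the commit message describes.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Result is a hypothetical stand-in for the provisioner's result type;
// the field names are assumptions made for this sketch.
type Result struct {
	Dirty        bool          // Host status changed and must be written back
	RequeueAfter time.Duration // how long to wait before reconciling again
	ErrorMessage string        // non-empty when the current operation failed
}

// retryAfterDelay reports a conflict that should be retried after a delay.
func retryAfterDelay(delay time.Duration) (Result, error) {
	return Result{RequeueAfter: delay}, nil
}

// hostUpdated reports that the Host status must be saved to the k8s API.
func hostUpdated() (Result, error) {
	return Result{Dirty: true}, nil
}

// operationContinuing reports that an ongoing process has not finished.
// It builds the same structure as retryAfterDelay; only the name differs.
func operationContinuing(delay time.Duration) (Result, error) {
	return Result{RequeueAfter: delay}, nil
}

// operationComplete reports successful completion of the current operation.
func operationComplete() (Result, error) {
	return Result{}, nil
}

// operationFailed reports a failure of the current operation.
func operationFailed(message string) (Result, error) {
	return Result{ErrorMessage: message}, nil
}

// transientError reports a temporary problem such as a network failure.
func transientError(err error) (Result, error) {
	return Result{}, err
}

func main() {
	waiting, _ := operationContinuing(10 * time.Second)
	conflict, _ := retryAfterDelay(10 * time.Second)
	fmt.Println(waiting == conflict) // same structure, different intent

	failed, _ := operationFailed("cleaning failed")
	fmt.Println(failed.ErrorMessage)

	_, err := transientError(errors.New("connection refused"))
	fmt.Println(err)
}
```

Naming the outcome at the call site keeps the intent readable even when the returned values are identical.
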
During deprovisioning of a Host, if 'deleting' (i.e. deprovisioning) the
node succeeds (i.e. it doesn't go to the Error state) but the automated
cleaning that follows fails, the only way to recover is to return the
node to the manageable state.

Previously, once in the manageable state we would attempt adoption on
the node so that we could deprovision again. However, in the course of
'deleting' the node, the image information is cleared from it so it
cannot be adopted again. (Adoption continues to be the right thing to do
if the node has just been re-registered due to the Ironic database being
recreated, and in that case the image information is present since it
gets added during the initial registration.)

To work around this, don't attempt to adopt during the Deprovisioning
state if the node is manageable and the image data is not present.
Handle the manageable state in Deprovision() by declaring the
deprovisioning complete.
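
The workaround described above can be sketched as two small decisions. This is an illustration under stated assumptions, not the operator's actual code: the `node` type, the `HasImage` field, and both function names are hypothetical, and a real `Deprovision()` handles more states than are shown here.

```go
package main

import "fmt"

// provisionState uses Ironic's state vocabulary; everything else in this
// sketch is hypothetical.
type provisionState string

const (
	manageable provisionState = "manageable"
	active     provisionState = "active"
)

// node is a simplified stand-in for an Ironic node.
type node struct {
	State    provisionState
	HasImage bool // whether image information is still recorded on the node
}

// skipAdoptionWhileDeprovisioning returns true for a manageable node whose
// image data is gone, i.e. one that was 'deleted' and then failed cleaning.
// A freshly re-registered node still has image data, so it is not skipped.
func skipAdoptionWhileDeprovisioning(n node) bool {
	return n.State == manageable && !n.HasImage
}

// deprovisionDone treats a manageable node as fully deprovisioned. Other
// states are reported as not done here (an assumption of this sketch).
func deprovisionDone(n node) bool {
	switch n.State {
	case manageable:
		return true
	default:
		return false
	}
}

func main() {
	afterCleanFailure := node{State: manageable, HasImage: false}
	reRegistered := node{State: manageable, HasImage: true}

	fmt.Println(skipAdoptionWhileDeprovisioning(afterCleanFailure))
	fmt.Println(skipAdoptionWhileDeprovisioning(reRegistered))
	fmt.Println(deprovisionDone(afterCleanFailure))
}
```
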

A node in the manageable state cannot be re-provisioned without first
being cleaned - it must go through cleaning to reach the available state
before it can be provisioned. Provisioning already handles nodes in the
manageable state, as this is how they begin after the initial inspection
of the host before the first provisioning (which does the initial
cleaning).
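
To illustrate why stopping in the manageable state is safe, here is a hedged sketch of the ordering constraint described above: a manageable node has to be cleaned into the available state before it can be deployed. The action labels and the function name are hypothetical and are not Ironic API verbs.

```go
package main

import "fmt"

// provisionState uses Ironic's state names; the actions are made-up labels.
type provisionState string

const (
	manageable provisionState = "manageable"
	cleaning   provisionState = "cleaning"
	available  provisionState = "available"
)

// nextProvisioningAction returns what a provisioning step would request
// for a node in the given state (hypothetical helper for this sketch).
func nextProvisioningAction(s provisionState) string {
	switch s {
	case manageable:
		return "clean" // must pass through cleaning to become available
	case available:
		return "deploy" // only an available node can be provisioned
	default:
		return "wait" // e.g. cleaning is still in progress
	}
}

func main() {
	for _, s := range []provisionState{manageable, cleaning, available} {
		fmt.Printf("%s: %s\n", s, nextProvisioningAction(s))
	}
}
```

Because the manageable case always routes through cleaning first, a node left manageable after a failed clean is never deployed without being cleaned.
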

(cherry picked from commit ba38688)
Signed-off-by: Honza Pokorny <honza@redhat.com>
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Feb 1, 2021
@openshift-ci-robot

@honza: This pull request references Bugzilla bug 1917484, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1917484: Don't adopt after clean failure during deprovisioning

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Feb 1, 2021
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 1, 2021
@andfasano

/lgtm

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano, honza

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 1, 2021
@openshift-merge-robot openshift-merge-robot merged commit e2aaee4 into openshift:master Feb 1, 2021
@openshift-ci-robot

@honza: All pull requests linked via external trackers have merged:

Bugzilla bug 1917484 has been moved to the MODIFIED state.
